# Full text of "Decision Approach and Empirical Bayes FCR-Controlling Interval for Mixed Prior Model"

Electronic Journal of Statistics
Vol. (2009)
ISSN: 1935-7524

Decision Approach and Empirical Bayes FCR-Controlling Interval for Mixed Prior Model

Zhigen Zhao
e-mail: zhigen.zhao@gmail.com

Abstract: In this paper, I apply decision theory and the empirical Bayesian approach to construct confidence intervals for selected populations when the true parameters follow a mixture prior distribution. A loss function with two tuning parameters $k_1$ and $k_2$ is introduced to address the mixture prior. One specific choice of $k_2$ leads to the procedure in Qiu and Hwang (2007); another choice of $k_2$ provides an interval construction which controls the Bayes FCR. Both analytical results and extensive numerical simulation studies demonstrate that the new empirical Bayesian FCR-controlling approach enjoys a great length reduction. At the end, I apply the different methods to a microarray data set. It turns out that the average length of the new approach is only 57% of that of Qiu and Hwang's procedure, which controls the simultaneous non-coverage probability, and 66% of that of Benjamini and Yekutieli (2005)'s procedure, which controls the frequentist's FCR.

AMS 2000 subject classifications: Decision Bayes, Loss Function, Simultaneous Intervals.

1. Introduction

Simultaneous interval estimation for a large number of selected parameters is challenging, especially when the number of observations for each parameter is very small. The difficulties are the selection bias (see Qiu and Hwang 2007 and Hwang 1993) and the multiplicity. The traditional approach, which treats all the parameters as fixed, seems to have little power when the dimension is very large, for instance several thousand in microarray studies. The empirical Bayesian approach, however, is known to borrow strength across the populations. Thus, it is very likely that this method will provide satisfactory procedures.
In the past, people attempted to estimate the parameters of selected populations (see, for example, Cohen and Sackrowitz 1982 and Hwang 1993). However, very few people knew how to construct an interval for a selected population. The first exciting work was written by Benjamini and Yekutieli (2005) (I will use B-Y (2005) to represent this work throughout the paper). They adapted the concept of FDR from multiple testing and coined the concept of the False Coverage Rate (FCR) for simultaneous intervals. This criterion is much less conservative than the simultaneous non-coverage coefficient. They constructed confidence intervals for multiple selected parameters which control the FCR at a specified $q$-level, typically 5%. They centered their intervals at the estimators $X_i$, which are biased for selected populations, and addressed the multiplicity by lengthening the intervals. Consequently, their intervals have an extremely large average half length.

*This is an original survey paper.
imsart-ejs ver. 2008/08/29 file: ejs_2009_359.tex date: January 15, 2009

In 2008, Zhao and Hwang introduced the Bayes FCR and connected the Bayes confidence interval, which aims at controlling the Bayesian non-coverage coefficient, with Bayesian FCR-controlling intervals. They applied this general theorem to the normal-normal setting, where the observations follow a normal distribution with unequal but known variances and the parameters follow a normal prior. They used the empirical Bayesian approach to derive explicit intervals which control the empirical Bayes FCR. Their construction reduced the average length of B-Y's procedure dramatically because they addressed the multiplicity by modifying the centers instead of the lengths.

Another exciting work is Qiu and Hwang (2007), which offers a way to construct intervals that control the simultaneous coverage coefficient for selected populations.
Besides the normal-normal model, Qiu and Hwang treated the so-called normal-mixture model, where the prior distribution of the true parameters is a mixture of a normal random variable with an equal, known variance and a point mass at zero. Because they addressed the multiplicity by Bonferroni's correction, their intervals tend to be long when many parameters are selected.

In this paper, I use the decision approach and empirical Bayes to construct intervals for selected populations under the same model setting as Qiu and Hwang (2007). The application of the decision approach to interval/set estimation has a long history, which dates back to Faith (1976), Casella and Hwang (1983), and He (1992). Recently, Hwang, Qiu, and Zhao (2008) constructed the double shrinkage empirical Bayes confidence interval for a single parameter when the variances are assumed to be unequal and unknown. However, none of the loss functions they used is appropriate under the mixed prior model (the detailed argument is in Section 2.2). Thus a new loss function with two tuning parameters $k_1$ and $k_2$ is proposed and recommended. One specific choice of $k_2$ results in Qiu and Hwang (2007)'s procedure. Another choice of $k_2$ provides a way to construct the empirical Bayesian FCR-controlling intervals based on the normal-mixture model.

In Section 2, I introduce the model setting and the decision Bayes rule based on the new loss function. In Section 3, I first connect the decision Bayesian rule with Qiu and Hwang (2007)'s procedure and then derive a procedure which controls the Bayes FCR. In Section 4, the empirical Bayesian approach is constructed and evaluated both numerically and analytically. In Section 5, I apply the confidence intervals constructed in Section 4 to a real microarray data set and compare them with B-Y (2005)'s and Qiu and Hwang's procedures. It turns out that my procedure outperforms theirs.
The average length of my interval is only 57% of that of Qiu and Hwang (2007)'s procedure, which controls the simultaneous coverage probability, and 66% of that of B-Y (2005)'s procedure, which controls the frequentist's FCR.

2. Normal-Mixture Model for the means

2.1. Model Assumption

In microarray studies, it is generally assumed that the observed differential expression levels $X_i$ are normally distributed with true means $\theta_i$, $i = 1, 2, \ldots, p$, where the dimension $p$ varies from several thousand to over thirty thousand. Because the dimension is so large, it is strongly recommended to use a prior to model the true means $\theta_i$. A natural choice is the normal prior $\theta_i \stackrel{iid}{\sim} N(0, \tau^2)$. However, Qiu and Hwang (2007) applied a Q-Q plot to a microarray data set and showed that the normal-normal model does not fit the data well. To remedy this, they introduced the normal-mixture model as follows. Assume that $X_i \mid \theta_i \sim N(\theta_i, \sigma^2)$ and
$$\theta_i \begin{cases} = 0 & \text{with probability } \pi_0, \\ \sim N(0, \tau^2) & \text{with probability } \pi_1 = 1 - \pi_0. \end{cases} \qquad (1)$$
I use an indicator $I_i$ to describe whether $\theta_i$ is 0, i.e. $I_i = 0$ if $\theta_i = 0$ and $I_i = 1$ if $\theta_i \sim N(0, \tau^2)$. Initially, I assume that the hyper-parameters $\tau^2$ and $\pi_0$ are known and derive the corresponding decision Bayesian procedure. In Section 4, I estimate them from the data by using consistent estimators and derive an empirical Bayesian procedure.

2.2. Bayes Interval

Historically, there have been many attempts to apply the decision Bayes approach to construct confidence sets/intervals. Faith (1976) first introduced a linear loss function for a confidence set $CI$ of the parameter $\theta$,
$$L(\theta, CI) = k\,\mathrm{Volume}(CI) - I_{CI}(\theta),$$
where the tuning parameter $k$ was determined by a minimax rule in Casella and Hwang (1983). He (1992) used $L(\theta_i, CI_i) = k\,\mathrm{Len}(CI_i) - I_{CI_i}(\theta_i)$ as the loss function for the interval estimator $CI_i$ of the parameter $\theta_i$.
Hwang, Qiu, and Zhao (2008) modified the loss function above as $L(\theta_i, CI_i) = \frac{k}{\sigma_i}\,\mathrm{Len}(CI_i) - I_{CI_i}(\theta_i)$ and constructed the double shrinkage confidence interval when the variances are assumed to be unequal and unknown.

However, none of these loss functions is appropriate for the normal-mixture model (1). In fact, for any given confidence interval, one can construct a new interval which is the union of the existing interval and the point zero. This new interval boosts the coverage probability while causing no change in length. Consequently, the conditional expected loss of the new construction is always less than or equal to that of the original one. As a result, the decision Bayes rule suggests that zero should be included in every interval. Practically, however, such constructions have no power and are useless. In order to avoid this phenomenon, I add extra terms which influence the loss function only when the point zero is included, and thus define the loss function as
$$L(\theta_i, CI_i) = k_1\,\mathrm{Len}(CI_i)\,I(I_i = 1) - I(\theta_i \in CI_i, I_i = 1) + I_{CI_i}(0)\,(k_2 - I(I_i = 0)), \quad 0 < k_2 < 1. \qquad (2)$$
The first two terms balance the length and the true coverage when the true parameter $\theta_i$ is generated from the normal random variable. The tuning parameter $k_1$ will be determined later in this section. The last two terms affect the loss function only when the corresponding interval includes zero. In that case, if $\theta_i$ is indeed zero, then $k_2 - I(I_i = 0) = k_2 - 1 < 0$, implying that including zero is useful. On the other hand, if $\theta_i$ is not zero, then $k_2 - I(I_i = 0) = k_2$ is positive and becomes a penalty term. Thus, an appropriate choice of the tuning parameter $k_2$ decides when zero should be included.

Furthermore, the flexibility in choosing $k_2$ offers constructions under different settings. For example, under the normal-normal model, the loss function (2) reduces to He (1992)'s if I set $k_2 = 0$.
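To make the role of the last two terms of the loss function (2) concrete, the following minimal sketch evaluates the loss for a set of the form (lo, hi) with the point zero either forced in or removed. The function name `mixture_loss`, its argument names, and the numerical values of $k_1$ and $k_2$ are illustrative assumptions and do not come from the paper.

```python
def mixture_loss(theta, is_signal, lo, hi, zero_in, k1, k2):
    """Evaluate loss (2) for the set (lo, hi) with the point 0 forced in or out.

    theta     : true parameter value
    is_signal : True if I_i = 1 (theta drawn from the normal part), False if I_i = 0
    zero_in   : True if 0 belongs to the set, False if 0 has been removed
    """
    length = hi - lo                      # Len(CI); adding or removing one point does not change it
    if theta == 0.0:
        covered = zero_in
    else:
        covered = lo < theta < hi
    loss = 0.0
    if is_signal:                         # first two terms act only when I_i = 1
        loss += k1 * length - float(covered)
    if zero_in:                           # last two terms act only when 0 is in the set
        loss += k2 - (0.0 if is_signal else 1.0)
    return loss


# Hypothetical tuning constants, chosen only for illustration.
k1, k2 = 0.1, 0.3

# theta_i = 0 (I_i = 0): adding zero changes the loss by k2 - 1 < 0.
null_without_zero = mixture_loss(0.0, False, 0.5, 2.0, False, k1, k2)
null_with_zero = mixture_loss(0.0, False, 0.5, 2.0, True, k1, k2)

# theta_i != 0 (I_i = 1): adding zero costs the penalty k2 > 0.
signal_without_zero = mixture_loss(1.2, True, 0.5, 2.0, False, k1, k2)
signal_with_zero = mixture_loss(1.2, True, 0.5, 2.0, True, k1, k2)
```

In this sketch the point zero is tracked by a flag rather than as part of the interval, which mirrors the fact that adding or removing a single point leaves the length unchanged.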
In Section 3, I apply two different choices of $k_2$, one of which reproduces Qiu and Hwang (2007)'s procedure and the other of which provides a construction that controls the Bayesian FCR at the $q$-level.

Now I have all the pieces to construct the decision Bayes rule, i.e. I want to construct a Bayes interval $CI_i^B$ that minimizes $E(L(\theta_i, CI_i) \mid X)$ for any observation $X$ under the normal-mixture model (1) and the loss function (2).

Theorem 2.1 Let $\pi_i^0(X) = P(\theta_i = 0 \mid X) = P(I_i = 0 \mid X)$ and $\pi_i^1(X) = 1 - \pi_i^0(X)$. Then
$$E(L(\theta_i, CI_i) \mid X) = \pi_i^1(X) \int_{CI_i} \big(k_1 - \pi(\theta_i \mid X, I_i = 1)\big)\,d\theta_i + I_{CI_i}(0)\,\big(k_2 - \pi_i^0(X)\big). \qquad (3)$$
The Bayes interval is
$$CI_i^B = \begin{cases} \{\theta_i : k_1 < \pi(\theta_i \mid X_i, I_i = 1)\} \setminus \{0\} & \text{if } k_2 > \pi_i^0(X), \\ \{\theta_i : k_1 < \pi(\theta_i \mid X_i, I_i = 1)\} \cup \{0\} & \text{if } k_2 \le \pi_i^0(X). \end{cases} \qquad (4)$$

Intuitively, for any given observation $X_i$, if the conditional probability $P(\theta_i = 0 \mid X)$ is small, it is very unlikely that $\theta_i = 0$, and zero should be excluded. On the other hand, a larger $\pi_i^0(X)$ indicates that zero should be included. Theorem 2.1 tells us that the parameter $k_2$ is the threshold value.

Under model (1), $\pi(\theta_i \mid X_i, I_i = 1) \sim N(MX_i, M\sigma^2)$, where $M = \frac{\tau^2}{\sigma^2 + \tau^2}$; therefore
$$\{\theta_i : k_1 < \pi(\theta_i \mid X_i, I_i = 1)\} = \{\theta_i : (\theta_i - MX_i)^2 < -M\sigma^2\,(2\log(k_1\sqrt{2\pi}) + \log(M\sigma^2))\}.$$
As in Section 3 of Hwang, Qiu, and Zhao (2008), one wants to obtain the traditional normal interval when the non-informative prior is applied, i.e., when $\tau \to \infty$ and $M \to 1$, one wants the corresponding interval $\{\theta_i : (\theta_i - X_i)^2/\sigma^2 < -(2\log(k_1\sqrt{2\pi}) + \log\sigma^2)\}$ to coincide with the normal interval $(X_i - z_{q/2}\sigma, X_i + z_{q/2}\sigma)$, where $z_{q/2}$ is the critical value such that $P(|Z| > z_{q/2}) = q$ when $Z$ is a standard normal random variable. Therefore, the constant $k_1$ should be chosen such that
$$z_{q/2}^2 = -(2\log(k_1\sqrt{2\pi}) + \log\sigma^2).$$
Plug this constant $k_1$ back into the Bayes interval (4).
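The calibration of $k_1$ can be checked numerically. The sketch below, under the stated model with known $\tau^2$ and $\sigma^2$, finds the region where the posterior density given $I_i = 1$ exceeds the calibrated $k_1$ by brute force on a grid, and compares it with the closed form obtained by substituting $k_1$ into the set above, namely $(\theta_i - MX_i)^2 < M\sigma^2(z_{q/2}^2 - \log M)$. All function names and the numerical inputs are assumptions made for illustration only.

```python
import numpy as np
from statistics import NormalDist


def calibrated_k1(q, sigma2):
    """k1 solving z_{q/2}^2 = -(2*log(k1*sqrt(2*pi)) + log(sigma2))."""
    z = NormalDist().inv_cdf(1 - q / 2)
    return np.exp(-z**2 / 2) / np.sqrt(2 * np.pi * sigma2)


def region_on_grid(x, tau2, sigma2, q, n_grid=200001):
    """Endpoints of {theta : k1 < posterior density given I_i = 1}, by brute force."""
    M = tau2 / (sigma2 + tau2)
    var = M * sigma2                      # posterior variance of theta given I_i = 1
    sd = np.sqrt(var)
    theta = np.linspace(M * x - 6 * sd, M * x + 6 * sd, n_grid)
    density = np.exp(-(theta - M * x) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    inside = theta[density > calibrated_k1(q, sigma2)]
    return inside.min(), inside.max()


def region_closed_form(x, tau2, sigma2, q):
    """Endpoints M*x -/+ sqrt(M*sigma2*(z_{q/2}^2 - log M)) implied by the algebra."""
    M = tau2 / (sigma2 + tau2)
    z = NormalDist().inv_cdf(1 - q / 2)
    half = np.sqrt(M * sigma2 * (z**2 - np.log(M)))
    return M * x - half, M * x + half


# Illustrative inputs (not taken from the paper).
grid_lo, grid_hi = region_on_grid(x=2.5, tau2=4.0, sigma2=1.0, q=0.05)
exact_lo, exact_hi = region_closed_form(x=2.5, tau2=4.0, sigma2=1.0, q=0.05)
```

With a very large `tau2` the closed form should approach the usual interval $X_i \pm z_{q/2}\sigma$, which is the requirement used to fix $k_1$.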
With this choice of $k_1$, i.e. $z_{q/2}^2 = -(2\log(k_1\sqrt{2\pi}) + \log\sigma^2)$, the decision Bayes interval becomes
$$CI_i^B = \begin{cases} \{\theta_i : (\theta_i - MX_i)^2 < M\sigma^2(z_{q/2}^2 - \log M)\} \setminus \{0\} & \text{if } k_2 > \pi_i^0(X), \\ \{\theta_i : (\theta_i - MX_i)^2 < M\sigma^2(z_{q/2}^2 - \log M)\} \cup \{0\} & \text{if } k_2 \le \pi_i^0(X). \end{cases} \qquad (5)$$
Unlike the interval $MX_i \pm \sqrt{M}\sigma z_{q/2}$, which is directly derived from the posterior distribution, the major part of (5) has an extra positive term $M\sigma^2(-\log M)$, which is necessary to boost the coverage probability when the hyper-parameters are estimated from the data in Section 4. In the next section, I choose the value of the parameter $k_2$ under two different problem settings and derive the decision Bayes interval accordingly.

3. Choose $k_2$

3.1. Qiu and Hwang (2007)

Qiu and Hwang (2007) constructed intervals for the $K$ selected populations $X_{(p-K+1)}, X_{(p-K+2)}, \ldots, X_{(p)}$ under the model (1), where the observations $X_{(j)}$ satisfy
$$|X_{(1)}| \le |X_{(2)}| \le \cdots \le |X_{(p)}|.$$
Let $\theta_{(j)}$ be the true parameter corresponding to the observation $X_{(j)}$. Note that $|\theta_{(p)}|$ is not necessarily equal to $\max_j |\theta_{(j)}|$. I construct the interval for $\theta_{(j)}$, where $p - K + 1 \le j \le p$, as
$$CI_{(j)} = \begin{cases} \{\theta_{(j)} : (\theta_{(j)} - MX_{(j)})^2 < M\sigma^2(z_{q/2K}^2 - \log M)\} \setminus \{0\} & \text{if } k_2 > \pi_{(j)}^0(X), \\ \{\theta_{(j)} : (\theta_{(j)} - MX_{(j)})^2 < M\sigma^2(z_{q/2K}^2 - \log M)\} \cup \{0\} & \text{if } k_2 \le \pi_{(j)}^0(X). \end{cases} \qquad (6)$$
Compared with (5), the major difference is that I use the critical value $z_{q/2K}$ to address the multiplicity. This is known as Bonferroni's correction. Direct calculation shows that for each $j$,
$$P(\theta_{(j)} \notin CI_{(j)} \mid X) \le q/K + \pi_{(j)}^0(X)\big(I(\pi_{(j)}^0(X) < k_2) - q/K\big).$$
Consequently, the simultaneous non-coverage coefficient, i.e. the probability that at least one selected parameter is not covered, satisfies
$$P(\theta_{(j)} \notin CI_{(j)},\ j = p-K+1, \ldots, p \mid X) \le q + \sum_{j=p-K+1}^{p} \pi_{(j)}^0(X)\big(I(\pi_{(j)}^0(X) < k_2) - q/K\big). \qquad (7)$$
If $k_2$ is chosen to be the maximum $k$ such that the summation above is non-positive, i.e.
$$k_2 = \max\Big\{k : \sum_{j=p-K+1}^{p} \pi_{(j)}^0(X)\big(I(\pi_{(j)}^0(X) < k) - q/K\big) \le 0\Big\}, \qquad (8)$$
then the non-coverage coefficient $P(\theta_{(j)} \notin CI_{(j)},\ j = p-K+1, \ldots, p)$ is controlled at the $q$-level. Surprisingly, this choice of $k_2$ is exactly the same as in Theorem 4 of Qiu and Hwang (2007).
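The choice of $k_2$ in (8) can be computed directly from the posterior zero probabilities of the selected observations, because the sum in (8) is a non-decreasing step function of $k$. The following sketch is one possible implementation under that observation; the function names and the example probabilities are hypothetical, and the posterior probabilities are assumed to have been computed already.

```python
import numpy as np


def k2_simultaneous(pi0_selected, q):
    """Largest k with sum_j pi0_j * (1{pi0_j < k} - q/K) <= 0, as in (8).

    pi0_selected : posterior probabilities P(theta_(j) = 0 | X) of the K selected
                   observations (assumed to be available).
    """
    p0 = np.sort(np.asarray(pi0_selected, dtype=float))
    K = p0.size
    budget = q / K * p0.sum()
    # Number m of smallest probabilities whose running sum stays within the budget.
    m = int(np.searchsorted(np.cumsum(p0), budget, side="right"))
    # The sum in (8) stays <= 0 up to and including k = (m+1)-th smallest value;
    # if every probability fits in the budget, k is capped at 1.
    return p0[m] if m < K else 1.0


def zero_is_included(pi0_selected, k2):
    """Rule in (6): zero is added to the interval exactly when k2 <= pi0_(j)."""
    return np.asarray(pi0_selected, dtype=float) >= k2


# Hypothetical posterior zero probabilities for K = 4 selected observations.
example_pi0 = [0.9, 0.01, 0.5, 0.02]
example_k2 = k2_simultaneous(example_pi0, q=0.05)
example_flags = zero_is_included(example_pi0, example_k2)
```

Sorting once and using a cumulative sum avoids scanning a grid of candidate values of $k$.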
Qiu and Hwang (2007)'s Bayes procedure is therefore exactly the same as (6).

3.2. Bayes FCR Controlling Interval

Benjamini and Yekutieli (2005) initiated the concept of the FCR, which is much less conservative than the simultaneous non-coverage coefficient. Zhao and Hwang (2008) extended this idea to the Bayesian framework through a new concept, the Bayes FCR. They showed that there is a natural connection between the Bayes FCR and the Bayes confidence interval. In this subsection, I show that (5) controls the Bayes FCR at the $q$-level if $k_2$ is chosen appropriately.

Theorem 3.1 Assume that $\mathcal{R}(X)$ is the index set of the observations that are selected for interval estimation, and let $R = \#\mathcal{R}$. Define
$$f(p, \tau^2, \pi_0, k) = E\left(\frac{\sum_{i \in \mathcal{R}} \pi_i^0(X)\big(I(\pi_i^0(X) < k) - q\big)}{R}\, I(R > 0)\right),$$
and $k_2 = \max\{k : f(p, \tau^2, \pi_0, k) \le 0\}$. Then the intervals (5) satisfy
$$FCR_\pi \le qP(R > 0).$$
In other words, the Bayes FCR of the intervals (5) is controlled at the $q$-level.

Now assume that the selection rule in Qiu and Hwang (2007) is applied, i.e., the last $K$ observations after ordering $X_1, X_2, \ldots, X_p$ increasingly according to their absolute values are selected. Then
$$f(p, \tau^2, \pi_0, k) = \frac{1}{K}\,E\left(\sum_{i=p-K+1}^{p} \pi_{(i)}^0(X)\big(I(\pi_{(i)}^0(X) < k) - q\big)\right).$$
Comparing it with the corresponding sum in (8), $f$ is always smaller when $K > 1$, since $q$ is subtracted in place of $q/K$. This implies that the choice of $k_2$ for the Bayes FCR controlling interval is always larger than Qiu and Hwang's choice. Consequently, under the same setting, (5) includes zero less frequently than Qiu and Hwang's interval does. Furthermore, since they addressed the multiplicity by Bonferroni's correction, the half length $\sqrt{M\sigma^2(z_{q/2K}^2 - \log M)}$ is much larger than the half length of the Bayes FCR controlling interval (5), and the discrepancy becomes large when $K$ is big. Both facts imply that the Bayes FCR controlling interval is less conservative than that of Qiu and Hwang (2007).

Another nice feature of this theorem is that it holds for any selection rule, including pre-determined and data-driven selection rules.
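Since $f$ in Theorem 3.1 is an expectation under model (1), it can be approximated by simulation when the hyper-parameters are known. The sketch below is a hedged illustration for the top-$K$ selection rule: it draws data sets from the normal-mixture model, evaluates the posterior zero probabilities of the selected observations, and averages the quantity inside the expectation over a grid of $k$. The function names, the grid, the number of replications, and the parameter values are all assumptions made for this example.

```python
import numpy as np


def posterior_zero_prob(x, pi0, tau2, sigma2=1.0):
    """P(theta_i = 0 | X_i) under the normal-mixture model (1)."""
    M = tau2 / (sigma2 + tau2)
    # log of [pi1 * N(x; 0, sigma2 + tau2)] / [pi0 * N(x; 0, sigma2)]
    log_odds_signal = (np.log1p(-pi0) - np.log(pi0)
                       + 0.5 * np.log1p(-M) + M * x**2 / (2 * sigma2))
    return np.exp(-np.logaddexp(0.0, log_odds_signal))


def estimate_f(k_grid, p, K, pi0, tau2, sigma2=1.0, q=0.05, n_sim=200, seed=0):
    """Monte Carlo estimate of f(p, tau2, pi0, k) when the K largest |X_i| are selected."""
    rng = np.random.default_rng(seed)
    k_grid = np.asarray(k_grid, dtype=float)
    total = np.zeros(k_grid.size)
    for _ in range(n_sim):
        signal = rng.random(p) >= pi0                     # I_i = 1 with probability 1 - pi0
        theta = np.where(signal, rng.normal(0.0, np.sqrt(tau2), p), 0.0)
        x = theta + rng.normal(0.0, np.sqrt(sigma2), p)
        selected = np.argsort(np.abs(x))[-K:]             # top-K selection, so R = K > 0
        pz = posterior_zero_prob(x[selected], pi0, tau2, sigma2)
        # Average over the selected set of pi_i^0 * (1{pi_i^0 < k} - q), for every k.
        below = pz[None, :] < k_grid[:, None]
        total += (pz[None, :] * (below - q)).mean(axis=1)
    return total / n_sim


def choose_k2(k_grid, f_values):
    """Largest grid value k with estimated f(k) <= 0 (None if there is none)."""
    ok = np.flatnonzero(np.asarray(f_values) <= 0.0)
    return None if ok.size == 0 else float(np.asarray(k_grid)[ok[-1]])


# Illustrative setting (p, K, pi0, tau2 are assumptions for this example).
k_grid = np.linspace(0.0, 1.0, 101)
f_hat = estimate_f(k_grid, p=1000, K=100, pi0=0.8, tau2=4.0)
k2_hat = choose_k2(k_grid, f_hat)
```

Because the estimated $f$ is non-decreasing in $k$, taking the last grid point with a non-positive value is enough; a finer grid or more replications would sharpen the estimate.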
For example, when observations are selected according to Benjamini and Hochberg (1995)'s procedure, which controls the False Discovery Rate at the $q$-level, and $k_2$ is simulated accordingly, Theorem 3.1 still guarantees that (5) controls the Bayes FCR at the $q$-level.

The choice of $k_2$ depends on an unknown expectation, which prevents us from finding $k_2$ explicitly. However, $k_2$ can easily be determined by simulation once the hyper-parameters are known.

4. Empirical Bayes Approach

In this section, I estimate the unknown hyper-parameters from the data and obtain a practical confidence interval. The goal is to construct confidence intervals for the selected parameters such that the Bayes FCR is always controlled for a class of prior distributions determined by the hyper-parameters $\pi_0$ and $\tau^2$. Following Zhao and Hwang (2008), this approach is named empirical Bayes FCR controlling intervals.

Recall the model (1). Then $EX_i^2 = \sigma^2 + \pi_1\tau^2$ and $EX_i^4 = 3(\sigma^4 + 2\pi_1\sigma^2\tau^2 + \pi_1\tau^4)$. By using the method of moments, with sample moments $m_2 = \sum_i X_i^2/p$ and $m_4 = \sum_i X_i^4/p$, one obtains reliable estimators of $\pi_0$ and $\tau^2$ when $p$ is sufficiently large:
$$\hat\tau^2 = \frac{m_4/3 - \sigma^4}{m_2 - \sigma^2} - 2\sigma^2, \qquad \hat\pi_1 = 1 - \hat\pi_0 = \frac{m_2 - \sigma^2}{\hat\tau^2}. \qquad (9)$$
Plug these two estimators into the function $f$ and simulate the value of $k_2$, which is denoted by $\hat k_2$. Let $\hat M$ and $\hat\pi_i^0(X)$ be the estimators of $M$ and $\pi_i^0(X)$ obtained when $\pi_0$ and $\tau^2$ are replaced by (9). Then I construct the empirical Bayes interval as
$$CI_i^{EB} = \begin{cases} \{\theta_i : (\theta_i - \hat M X_i)^2 < \hat M\sigma^2(z_{q/2}^2 - \log \hat M)\} \setminus \{0\} & \text{if } \hat k_2 > \hat\pi_i^0(X), \\ \{\theta_i : (\theta_i - \hat M X_i)^2 < \hat M\sigma^2(z_{q/2}^2 - \log \hat M)\} \cup \{0\} & \text{if } \hat k_2 \le \hat\pi_i^0(X). \end{cases} \qquad (10)$$
The following theorem describes the asymptotic property of the construction.

Theorem 4.1 For any $0 < \pi_0 < 1$ and $\tau^2 > 0$, suppose that for every $\epsilon > 0$ there exist $\delta, N > 0$ such that for all $p > N$ and $k, k' > 0$,
$$(\tau'^2 - \tau^2)^2 + (\pi_0' - \pi_0)^2 + (k' - k)^2 < \delta \quad \text{implies} \quad |f(p, \tau'^2, \pi_0', k') - f(p, \tau^2, \pi_0, k)| < \epsilon. \qquad (11)$$
Then under the model (1), the empirical Bayes interval (10) satisfies
$$\limsup_{p \to \infty} FCR_\pi \le q.$$
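The plug-in construction can be sketched in a few lines. The example below assumes that the moment estimators are the direct solution of the two moment equations $EX_i^2 = \sigma^2 + \pi_1\tau^2$ and $EX_i^4 = 3(\sigma^4 + 2\pi_1\sigma^2\tau^2 + \pi_1\tau^4)$, and it takes $\hat k_2$ as a given input rather than simulating it. Function names, the simulated data, and the value passed for $\hat k_2$ are hypothetical, and the sketch does not guard against the case where the estimates are not well defined (for instance a non-positive $\hat\tau^2$).

```python
import numpy as np
from statistics import NormalDist


def moment_estimates(x, sigma2):
    """Method-of-moments estimates of (pi0, tau2) from the second and fourth sample moments."""
    x = np.asarray(x, dtype=float)
    m2 = np.mean(x**2)
    m4 = np.mean(x**4)
    # E X^2 = sigma2 + pi1*tau2 and E X^4 = 3*(sigma2^2 + 2*pi1*sigma2*tau2 + pi1*tau2^2)
    # give tau2 + 2*sigma2 = (m4/3 - sigma2^2) / (m2 - sigma2).
    tau2_hat = (m4 / 3 - sigma2**2) / (m2 - sigma2) - 2 * sigma2
    pi1_hat = (m2 - sigma2) / tau2_hat
    return 1 - pi1_hat, tau2_hat


def eb_interval(x_i, pi0_hat, tau2_hat, k2_hat, sigma2, q=0.05):
    """Plug-in interval of the form (10): centre, half-width, and whether zero is added."""
    M = tau2_hat / (sigma2 + tau2_hat)
    z = NormalDist().inv_cdf(1 - q / 2)
    half = np.sqrt(M * sigma2 * (z**2 - np.log(M)))
    ratio = (1 - pi0_hat) / pi0_hat * np.sqrt(1 - M) * np.exp(M * x_i**2 / (2 * sigma2))
    pi0_post = 1 / (1 + ratio)            # estimated P(theta_i = 0 | X)
    return M * x_i, half, bool(k2_hat <= pi0_post)


# Simulated data with known hyper-parameters, used only as a sanity check.
rng = np.random.default_rng(1)
p, pi0, tau2, sigma2 = 200_000, 0.5, 4.0, 1.0
theta = np.where(rng.random(p) >= pi0, rng.normal(0.0, np.sqrt(tau2), p), 0.0)
x = theta + rng.normal(0.0, np.sqrt(sigma2), p)
pi0_hat, tau2_hat = moment_estimates(x, sigma2)
centre, half, with_zero = eb_interval(3.0, pi0_hat, tau2_hat, k2_hat=0.2, sigma2=sigma2)
```

Fourth-moment estimates are noisy, so a large $p$ is used in the check; with small $\tau^2$ the estimates can become unstable.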
Proposition 4.1 If the selected parameters are the first $R$ parameters and $R \to \infty$ as $p \to \infty$, then $f$ satisfies the condition (11).

This proposition implies that when all observations are selected for interval estimation, (10) controls the empirical Bayes FCR asymptotically. However, like all other existing constructions, such as Casella and Hwang (1983), Qiu and Hwang (2007), and Hwang, Qiu, and Zhao (2008), the interval (10) does not automatically provide a satisfactory answer in the finite-sample case. In Figure 1, I plot the Bayes FCR of the empirical Bayes interval and of B-Y's procedure under different settings of the hyper-parameters $(\pi_0, \tau^2)$ when $p = 1000$ and only the top 100 observations are selected for interval estimation. B-Y's procedure always controls the FCR at the 5% level; however, it is far too conservative, with an extremely low Bayes FCR when $M$ is close to 1 and a large average length. The green line, corresponding to the construction (10), performs well when $\tau^2$ is relatively large; however, some modification is required when $\tau^2$ is small.

Qiu and Hwang (2007) argued that $\pi_0$ is nearly unidentifiable when $\tau$ is small. This causes the estimator (9) to be very inaccurate. Therefore, they mixed their empirical Bayes intervals with the Bonferroni-corrected interval $(X_i - z_{q/2p}\sigma, X_i + z_{q/2p}\sigma)$ based on a threshold, $\min(\sqrt{720/p}, 0.6)$, obtained from extensive numerical calculations. It also seems necessary to mix the procedure (10) with the interval $(X_i - z_{Rq/2p}\sigma, X_i + z_{Rq/2p}\sigma)$, which is inspired by B-Y (2005). The following analytic argument makes it much easier to find the threshold value than in Qiu and Hwang (2007).
Recall that $EX_i^2 = \sigma^2 + \pi_1\tau^2$ and $EX_i^4 = 3(\sigma^4 + 2\pi_1\sigma^2\tau^2 + \pi_1\tau^4)$; therefore
$$\tau^2 + 2\sigma^2 = \frac{EX_i^4/3 - \sigma^4}{EX_i^2 - \sigma^2}.$$
Use $m_2 = \sum_i X_i^2/p$ and $m_4 = \sum_i X_i^4/p$ to denote the second and fourth sample moments; then
$$\hat\tau^2 + 2\sigma^2 = \frac{m_4/3 - \sigma^4}{m_2 - \sigma^2}.$$
Since the left-hand side is always greater than or equal to $2\sigma^2$, $\tau^2$ is not estimable when the right-hand side is less than $2\sigma^2$. Therefore, I can choose a proper $\tau_0$ such that the probability that the right-hand side is smaller than $2\sigma^2$, i.e. the probability that $\pi_0$ and $\tau^2$ are not estimable, is controlled at the level $q$. That is, set the threshold value $\tau_0^2$ to satisfy
$$P_{\tau^2 = \tau_0^2}\left(\frac{m_4/3 - \sigma^4}{m_2 - \sigma^2} < 2\sigma^2\right) \le q.$$
Now consider the special case $\pi_1 = 1$ and calculate $\tau_0$. Use $m_4'$ and $m_2'$ to denote the fourth and second sample moments of $p$ observations from the standard normal distribution. Then $m_4 = (\tau^2 + \sigma^2)^2 m_4'$ and $m_2 = (\tau^2 + \sigma^2) m_2'$. I choose $\tau_0$ such that
$$P_{\tau^2 = \tau_0^2}\left((\tau^2 + \sigma^2)^2 m_4'/3 - 2\sigma^2(\tau^2 + \sigma^2) m_2' + \sigma^4 < 0\right) \le q$$
by simulation. Based on this cutoff, the final empirical Bayes FCR controlling interval with mixture is defined as
$$CI_i = \begin{cases} X_i \pm z_{Rq/(2p)}\,\sigma & \text{if } m_2 - \sigma^2 < \tau_0^2, \\ CI_i^{EB} & \text{if } m_2 - \sigma^2 \ge \tau_0^2. \end{cases} \qquad (12)$$

In Figure 1, the red solid line corresponds to the above empirical Bayes intervals. They perform the same as B-Y's when $\tau^2$ is very small because of the mixed procedure. The portion of the mixture increases when $\pi_0$ increases. However, (12) performs better than B-Y's procedure when $\tau^2$ is larger. The discrepancy is significant when $M$ is close to 1.

I have also plotted the simulated average length in Figure 2, which corresponds to the same model settings as Figure 1. The average length of (12) is uniformly less than or equal to the average length of B-Y's procedure. The ratio can even be as small as 56%.

In Figures 3 and 4, I repeat the simulation setting but change the selection rule to Benjamini and Hochberg (1995)'s procedure, which aims at finding significant observations while controlling the False Discovery Rate at the 5% level. The
intervals (12) control the empirical Bayesian FCR at the 5% level under this data-driven selection. Compared with B-Y (2005)'s procedure, the improvement in average length is even more significant than under the fixed selection rule. The ratio can be as small as 43%.

5. Real Data Analysis

In this section, I apply different intervals to a microarray data set, the Synteni data of Kerr, Martin, and Churchill (2000), which was revisited by Hsu et al. (2006) and Qiu and Hwang (2007). A description of the data set can be found in Kerr et al. (2000). Figure 6 of Qiu and Hwang (2007) is a Q-Q plot of the ANOVA estimator $X_g$, which shows that the normal-mixture model (1) fits the data well. Hsu et al. (2006) use simultaneous confidence intervals to detect genes with an expression level of A = 3 or more. I first apply Benjamini and Hochberg (1995)'s procedure to select parameters with expression levels significantly larger than or equal to $\log_2 3$, and then construct simultaneous intervals for the selected observations. B-H's procedure declares that the first 89 genes are significant. In Figure 5, I construct the confidence intervals for these 89 genes by using Qiu and Hwang (2007)'s procedure, B-Y's procedure, and the new procedure. My confidence interval for $\theta_{(g)}$ is $0.93X_{(g)} \pm 0.96$. Compared with the interval $X_{(g)} \pm 1.47$ of B-Y's procedure and $0.93X_{(g)} \pm 1.67$ of Qiu and Hwang (2007), my intervals enjoy a great length reduction.

6. Discussion

In this article, I have defined a new loss function for confidence interval construction under the mixed prior model (1). I use two different ways of choosing the tuning parameter in the loss function to obtain Qiu and Hwang (2007)'s procedure and the empirical Bayesian FCR controlling intervals. I conclude that the new empirical Bayesian FCR controlling interval is better than other existing procedures because of the sharp improvement in average length.
However, there is still much need for further research. In model (1), I assume an equal and known variance $\sigma^2$, which is generally not a practical assumption. Hwang, Qiu, and Zhao (2008) proposed a double shrinkage empirical Bayesian interval for a single parameter without selection under the normal-lognormal model. Therefore, one natural extension of this work is to consider the mixture-prior model when the variances are unequal and unknown. The loss function (2) provides a potential tool for constructing the corresponding intervals.

7. References

Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B 57, 289-300.
Benjamini, Y. and Yekutieli, D. (2005). False Discovery Rate-Adjusted Multiple Confidence Intervals for Selected Parameters. Journal of the American Statistical Association 100, 71-93.
Casella, G. and Hwang, J. T. G. (1983). Empirical Bayes confidence sets for the mean of a multivariate normal distribution. Journal of the American Statistical Association 78, 688-698.
Choe, S. E., Boutros, M., Michelson, A. M., Church, G. M. and Halfon, M. S. (2005). Preferred analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset. Genome Biology 6, R16.1-16.
Faith, R. E. (1976). Minimax Bayes point and set estimators of a multivariate normal mean. Unpublished Ph.D. thesis. Department of Statistics, University of Michigan.
Faith, R. E. (1978). Minimax Bayes point estimators of a multivariate normal mean. J. Multivar. Anal. 8, 372-379.
He, K. (1992). Parametric empirical Bayes confidence intervals based on James-Stein estimator. Statistical Decision 10, 121-132.
Hsu, J. C., Chang, J. Y., and Wang, T. (2006). Simultaneous confidence intervals for differential gene expressions.
Journal of Statistical Planning and Inference 136, 2182-2196.
Hwang, J. T. (1993). Empirical Bayes estimation for the mean of the selected populations. Sankhya (Series A) 55, 285-311.
Hwang, J. T. G., Qiu, J., and Zhao, Z. (2008). Empirical Bayes confidence intervals shrinking both means and variances. To appear in the Journal of the Royal Statistical Society, Series B.
Kerr, M. K. (2003). Linear models for microarray data analysis: Hidden similarities and differences. Journal of Computational Biology 10, 891-901.
Morris, C. (1983a). Parametric empirical Bayes confidence intervals. In Scientific Inference, Data Analysis and Robustness, Academic Press, New York, 25-50.
Morris, C. (1983b). Parametric empirical Bayes inference: Theory and Applications. Journal of the American Statistical Association 78, 47-55.
Qiu, J. and Hwang, J. T. G. (2007). Sharp simultaneous intervals for the means of selected populations with application to microarray data analysis. Biometrics 63, 767-776.
Stein, C. (1981). Estimation of the mean of a multivariate normal distribution. Annals of Statistics 9, 1135-1151.
Zhao, Z. and Hwang, J. T. G. (2008). Empirical Bayes FCR Controlling Confidence Interval. Submitted.

Supplementary Materials

A.1. Technical Details of Mathematical Results

Proof of Theorem 2.1 Firstly,
$$E(L(\theta_i, CI_i) \mid X) = k_1\,\mathrm{Len}(CI_i)\,P(I_i = 1 \mid X) - \int I(\theta_i \in CI_i, I_i = 1)\,\pi(\theta_i \mid X)\,d\theta_i + I_{CI_i}(0)\,\big(k_2 - \pi_i^0(X)\big). \qquad (A.1)$$
The integral $\int I(\theta_i \in CI_i, I_i = 1)\,\pi(\theta_i \mid X)\,d\theta_i$ can be written as $\int_{CI_i} \pi(\theta_i, I_i = 1 \mid X)\,d\theta_i$, where $\pi(\theta_i, I_i = 1 \mid X) = \pi_i^1(X)\,\pi(\theta_i \mid I_i = 1, X)$. Write $\mathrm{Len}(CI_i)$ as $\int_{CI_i} d\theta_i$. Then (A.1) equals
$$\pi_i^1(X) \int_{CI_i} \big(k_1 - \pi(\theta_i \mid X, I_i = 1)\big)\,d\theta_i + I_{CI_i}(0)\,\big(k_2 - \pi_i^0(X)\big). \qquad (A.2)$$
Now consider two intervals $CI_i^1$ and $CI_i^2$, where $CI_i^1 = \{\theta_i : k_1 < \pi(\theta_i \mid X, I_i = 1)\} \setminus \{0\}$ and $CI_i^2 = \{\theta_i : k_1 < \pi(\theta_i \mid X, I_i = 1)\} \cup \{0\}$. Then both $CI_i^1$ and $CI_i^2$ minimize the first term of the formula (A.2).
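Since $0 \notin CI_i^1$ and $0 \in CI_i^2$, $E(L(\theta_i, CI_i^2) \mid X) = E(L(\theta_i, CI_i^1) \mid X) + (k_2 - \pi_i^0(X))$. Consequently, the Bayes interval includes zero if and only if $k_2 \le \pi_i^0(X)$, i.e. it is the one that is defined in (4).

Proof of Theorem 3.1 According to Zhao and Hwang (2008),
$$FCR_\pi = E\left(\frac{\sum_{i \in \mathcal{R}} P(\theta_i \notin CI_i^B \mid X)}{R}\, I(R > 0)\right).$$
Since
$$P(\theta_i \notin CI_i^B \mid X) = P(\theta_i \notin CI_i^B \mid X, I_i = 0)\,P(I_i = 0 \mid X) + P(\theta_i \notin CI_i^B \mid X, I_i = 1)\,P(I_i = 1 \mid X)$$
$$= \pi_i^0(X)\,I(\pi_i^0(X) < k_2) + (1 - \pi_i^0(X))\,P(\theta_i \notin CI_i^B \mid X, I_i = 1), \qquad (A.3)$$
and $P(\theta_i \notin CI_i^B \mid X, I_i = 1) \le q$,
$$FCR_\pi \le qE(I(R > 0)) + E\left(\frac{\sum_{i \in \mathcal{R}} \pi_i^0(X)\big(I(\pi_i^0(X) < k_2) - q\big)}{R}\, I(R > 0)\right) = qP(R > 0) + f(k_2).$$
The choice of $k_2$ ensures that $f(k_2) \le 0$. Consequently, $FCR_\pi \le qP(R > 0)$.

Proof of Theorem 4.1 Before the proof, I state and prove the following lemma.

Lemma A.1 Assume that $\hat\tau^2$ and $\hat\pi_0$ are consistent estimators of $\tau^2$ and $\pi_0$. Then for any $\delta > 0$, there exists $P_0 > 0$ such that for all $p > P_0$,
$$|\hat\pi_i^0 - \pi_i^0| < \delta, \quad \text{for all } i = 1, 2, \ldots, p.$$

Direct calculation shows that
$$\pi_i^0 = \frac{\pi_0\sqrt{\sigma^2 + \tau^2}\,\exp\!\big(-\frac{MX_i^2}{2\sigma^2}\big)}{\pi_0\sqrt{\sigma^2 + \tau^2}\,\exp\!\big(-\frac{MX_i^2}{2\sigma^2}\big) + \pi_1\sigma},$$
and $\hat\pi_i^0$ has the same form as $\pi_i^0$ except that $\pi_0$ and $\tau^2$ are replaced by their estimators $\hat\pi_0$ and $\hat\tau^2$. Now, I introduce an intermediate estimator $\tilde\pi_i^0$ in which $\pi_0$ is assumed known. I shall first prove that the lemma holds for $\tilde\pi_i^0$. Since $\hat\tau^2$ is consistent, $\hat M = \frac{\hat\tau^2}{\sigma^2 + \hat\tau^2}$ is also a consistent estimator of $M$. Then, for a suitably small $\epsilon$ chosen in terms of $\delta$, there exists $N$ such that for all $p > N$, $|\hat M - M| < \epsilon M$. Without loss of generality, assume that $\hat M > M$, i.e. $0 < \hat M - M < \epsilon M$. Since $M$ is an increasing function of $\tau^2$ when $\sigma^2$ is fixed, $\hat\tau^2 > \tau^2$. Direct calculation shows that
$$\tilde\pi_i^0 - \pi_i^0 = \frac{\pi_0\pi_1\sigma\left(\sqrt{\frac{\sigma^2 + \hat\tau^2}{\sigma^2 + \tau^2}}\,\exp\!\big(-\frac{(\hat M - M)X_i^2}{2\sigma^2}\big) - 1\right)}{\left(\pi_0\sqrt{\sigma^2 + \hat\tau^2}\,\exp\!\big(-\frac{\hat M X_i^2}{2\sigma^2}\big) + \pi_1\sigma\right)\left(\pi_0 + \pi_1\frac{\sigma}{\sqrt{\sigma^2 + \tau^2}}\,\exp\!\big(\frac{MX_i^2}{2\sigma^2}\big)\right)}. \qquad (A.4)$$
Since $M < \hat M$,
$$\frac{\sigma^2 + \hat\tau^2}{\sigma^2 + \tau^2} = \frac{1 - M}{1 - \hat M}.$$
Therefore, (A.4) yields a lower bound for $\tilde\pi_i^0 - \pi_i^0$ whose numerator is negative and whose denominator is larger than $\pi_0\pi_1\sigma$. Furthermore, $\hat M - M > -\epsilon M$ implies that
$$\tilde\pi_i^0 - \pi_i^0 \ge -\delta.$$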
(A.5) On the other hand, .o_ ^ 7ro7rig(exp(^^) - 1) _ TToVg^ + exp[^^) - 1 imsart-ejs ver. 2008/08/29 file: ejs_2009_359.tex date: January 15, 2009 Z. Zhao/EB Interval 12 I use C to denote the constant ^"'^^^J''^^ ', and let y = exp(^^^), then exp(^^^) = y''. If Xi = 0, then y = 1, Otherwise, if Xi ^ 0, then y > 1, and Combine (jA.Sp and (|A.6p . then I*" - 7r°| < max{S, Ce) < S. (A.7) Now, assume that ttq is also estimated by ttq. Let A = —^^^^=exp{^^), then |7r?-^?| TTo TTo , I (tto - 7ro)A The denominator greater than ttottiA implies that Itt," — -ir"! < | \ ■ Since ttq is consistent for ttq, for any 5 > 0, there 3Po such that Vp > Pq, Ittq — ttqI < (5, then where D is a constant that only depends on ttq. Combining this with (|A.7p . one can get that Itt" - TT^\ < (1 + D)6, for alH = 1, 2, • • • ,p and completes the proof. Proof of the theorem According to Zhao and Hwang (2008), FCR^ = E ^'^"^ ^^'^'J'^^' '^-" /(i? > 0) where TZ is the set of index of parameters that are selected and R is the number of selected parameters, i.e. R = #7?.. Similarly as formula (|A.3p in the proof of theorem 13. 11 Pie,icif^\x) = n^{X)I{nUX) < k) + (1 - 7r^{X))P{e, i Clf^\X,U = 1) In the empirical Bayes interval (|10[) . there exists a positive correction term — M log Mcr^. Dropping this term results in a short interval which enlarges the non-coverage probability, i.e. P{e, i Clf'^lX) < P{\e, - MX,f > Ma'^zl,^). Consequently, P{e, i Clf^\X) < 7r«(X)/(^,«(X) < fc2)+(l-^O(A))P((0,-AfA,)2 > Ma^z'^^^\X,l, = 1). imsart-ejs ver. 2008/08/29 file: ejs_2009_359.tex date: January 15, 2009 Z. Zhao/EB Interval 13 Rearrange the terms in the above formula, one can simply the conditional non-coverage probability P{6i ^ Clf^\X) as 7rO(X)(/(7r°(X) < k) -q)+ 7r^{X)iq - P((a, - MX,)^ > Ma'z^^/^\X,h = 1)) +Piie, - MXif > M(T^zl/^\XJ, = 1). Let = R ' , E.eA ^nX){ci Pm MX,r > Ma^zl^\X, h - 1)) = R ' and A3 = E.eA pm - MX^f > Ma^zl^\X, h = 1) R then FCRtt can be controlled from above by -E(Ai + A2 + A3). 
Since ttq and are obtained by using the method of moments, Delta method imphes that ttq — tt = Op (;^ ) and — = Op{^). According to Lemma (jA.l[) . for any e > 0, I can always find sufficiently large Po, such that for any p > Po, {t^ - r'^f < S/3 and {TTf{X) - Trf{X))^ < S/3. Consequently, EAi < E s = /(p^ r , TTo, fc2 + V 'j/3)- R Since (f^ - r'^f + (7rO(X) - TT°{X)f + (,5/3)2 < therefore according to the property of the function /, f{p,T^,7To,k2 + \/5j3) < f{p,T'^,TTo,k2) + £ < 6, Since ^2 is simulated as the maximum k2 such that /(p, f^jTTo, k2) < 0, EAi < e. (A.8) For the second term A2, E^^A^KX)\q-Pm - MX,f > Ma^zl^\X,h = 1)1 IA2I < R T^,^Aq-pm-MX,Y < E^eA ll - pm - MX^Y > Ma^zl/,\X,h = 1)| R Taking a close look at the term P{{6i- MXiY > Ma^z^/^\X,Ii = 1), one knows that {di\Xi,Ii = 1) N{MXi, Ma'^). Therefore one can replace 9i by MXi + \fMaZ where Z is a standard normal random variable which is inde- pendent of Xi. Consequently, pm ^yx^r > m<j'zI/2\x,i. = i)=pi\z- i > V2i^)- (A.9) imsart-ejs ver. 2008/08/29 file: ejs_2009_359.tex date: January 15, 2009 Z. Zhao/EB Interval 14 Assume that is the observation that has the largest absolute value, then into the range Consequently, for any i = 1, 2, • • • ,p, (|X9l) falls Ma ' M Ti'' (A.IO) Let = Vcr2 + T-2Zi, then Zi = 7roiV(0, ^^72 ) + 7ri7V(0, 1). Furthermore 'Ma ■ T As a result, the range (jA.lOp can be rewritten as (f2+ (72) [P{\Z- 'T(f2 +0-2) ' M M Z(,)\\>\\l-z,/2),Pm>\l-r-'',/2)] M (A.ll) Since the above range applies for all i's, one knows that IA2I < max{\q-P{\Z- a{T^ - t2) r(f2 + 0-2) Since f2 - t2 = Op(^), = OiV^Wp), a{T^ — t'"' V(^2+^2)^(P)|-«P(1)- The dominated convergence theorem implies that Zip)\\ > l\/^V2)l. \<i-Pi\z\ > Y^g ^9/2)1)- (A.12) (A.13) P{\Z- a{f and \\l^z,^,))^P{\Z\> z,/2) = q, Applying the dominated convergence theorem again, one can deduce from (jA.12[) that limsupE'lAal < 0. (A.14) p — >oo Similar arguments apply to A3 and one can show that A. 
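$$\le P\left(\left|Z - \frac{(\hat M - M)X_{(p)}}{\sqrt{M}\sigma}\right| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\right) = P\left(\left|Z - \frac{\sigma(\hat\tau^2 - \tau^2)}{\tau(\hat\tau^2 + \sigma^2)}\, Z_{(p)}\right| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\right).$$

The dominated convergence theorem and (A.13) imply that

$$\limsup_{p\to\infty} E|\Delta_3| \le \lim_{p\to\infty} E\,P(|Z| > z_{q/2}) = q. \quad (A.15)$$

(A.8), (A.14), and (A.15) imply that $\limsup_{p\to\infty} \mathrm{FCR}_\pi \le q$.

Proof of Proposition 4.1

Assume that $X_i \sim \pi_0 N(0, \sigma^2) + (1 - \pi_0) N(0, \tau^2 + \sigma^2)$ and $Y_i \sim \pi_0' N(0, \sigma^2) + (1 - \pi_0') N(0, \tau'^2 + \sigma^2)$, where $i = 1, 2, \ldots, p$. Then

$$|f(p, \pi_0, \tau^2, k) - f(p, \pi_0', \tau'^2, k')| = \left|E\,\frac{\sum_{i=1}^{R}\left[\pi_i^0(X)\left(I(\pi_i^0(X) < k) - q\right) - \pi_i^{\prime 0}(Y)\left(I(\pi_i^{\prime 0}(Y) < k') - q\right)\right]}{R}\right|$$

$$= \left|E\,\frac{-q\sum_{i=1}^{R}\left(\pi_i^0(X) - \pi_i^{\prime 0}(Y)\right) + \sum_{i=1}^{R}\left(\pi_i^0(X) I(\pi_i^0(X) < k) - \pi_i^{\prime 0}(Y) I(\pi_i^{\prime 0}(Y) < k')\right)}{R}\right|.$$

Since $R$ goes to $\infty$ as $p \to \infty$, the law of large numbers implies that the function inside the above expectation converges in probability to

$$\Delta = -qE\left(\pi_1^0(X) - \pi_1^{\prime 0}(Y)\right) + E\left(\pi_1^0(X) I(\pi_1^0(X) < k) - \pi_1^{\prime 0}(Y) I(\pi_1^{\prime 0}(Y) < k')\right).$$

Since the integrand is a bounded function, it is sufficient to show that for every $\epsilon > 0$ there exists $\delta$ such that $(k' - k)^2 + (\tau'^2 - \tau^2)^2 + (\pi_0' - \pi_0)^2 < \delta$ implies that $|\Delta| < \epsilon$. In fact, $E\pi_1^0(X) = E(P(\theta_1 = 0 \mid X)) = P(\theta_1 = 0) = \pi_0$. This implies that

$$qE\left(\pi_1^0(X) - \pi_1^{\prime 0}(Y)\right) = q(\pi_0 - \pi_0'). \quad (A.16)$$

Furthermore, direct calculation shows that

$$E\left(\pi_1^0(X) I(\pi_1^0(X) < k)\right) = \int_{\{\pi_1^0(x) < k\}} \pi_1^0(x)\, m(x)\, dx = \int_{\{\pi_1^0(x) < k\}} \frac{\pi_0}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{x^2}{2\sigma^2}\right) dx.$$

Since $\{\pi_1^0(X) < k\}$ implies that $X^2 > \frac{2\sigma^2}{M}\left(\log\frac{1-k}{k} + \log\frac{\pi_0}{\pi_1\sqrt{1-M}}\right)$,

$$E\left(\pi_1^0(X) I(\pi_1^0(X) < k)\right) - E\left(\pi_1^{\prime 0}(Y) I(\pi_1^{\prime 0}(Y) < k')\right) = \pi_0 P\left(N^2 > \frac{2}{M}\left(\log\frac{1-k}{k} + \log\frac{\pi_0}{\pi_1\sqrt{1-M}}\right)\right) - \pi_0' P\left(N^2 > \frac{2}{M'}\left(\log\frac{1-k'}{k'} + \log\frac{\pi_0'}{\pi_1'\sqrt{1-M'}}\right)\right),$$

where $N$ is a standard normal random variable. When $k, k'$ are close to 1, $\log\frac{1-k}{k} \to -\infty$; therefore

$$\left|E\left(\pi_1^0(X) I(\pi_1^0(X) < k)\right) - E\left(\pi_1^{\prime 0}(Y) I(\pi_1^{\prime 0}(Y) < k')\right)\right| = |\pi_0 - \pi_0'|$$

if $k, k' > \epsilon_1$, where $\epsilon_1 < 1$ is sufficiently close to 1. Similarly, if $k, k'$ are close to 0, then $\log\frac{1-k}{k} \to \infty$. I can choose a sufficiently small $\epsilon_0$ such that, when $k, k' < \epsilon_0$, both tail probabilities above are arbitrarily small. Consequently,

$$\left|E\left(\pi_1^0(X) I(\pi_1^0(X) < k)\right) - E\left(\pi_1^{\prime 0}(Y) I(\pi_1^{\prime 0}(Y) < k')\right)\right| < \epsilon$$

when $k, k'$ are either close to 0 or close to 1. Furthermore, assume that $0 < \epsilon_0 < k$,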
$k' < \epsilon_1 < 1$; then by the continuity of $E\left(\pi_1^0(X) I(\pi_1^0(X) < k) - \pi_1^{\prime 0}(Y) I(\pi_1^{\prime 0}(Y) < k')\right)$, there exists a small $\delta < \epsilon$ such that $(k' - k)^2 + (\tau'^2 - \tau^2)^2 + (\pi_0' - \pi_0)^2 < \delta$ implies that

$$\left|E\left(\pi_1^0(X) I(\pi_1^0(X) < k) - \pi_1^{\prime 0}(Y) I(\pi_1^{\prime 0}(Y) < k')\right)\right| < \epsilon.$$

Combining this with (A.16), one obtains that $|\Delta| < \epsilon$ when $\delta$ is sufficiently small, which completes the proof.

A.2. Figures and Graphs

[Fig 1 panels: "FCR: p=1000, pi0=0.3, q=0.05" and "FCR: p=1000, pi0=0.5, q=0.05". Horizontal axis: M.]

Fig 1. These figures show the simulated Bayes FCR under different model settings against $M = \tau^2/(\sigma^2+\tau^2)$. The dimension is set to 1000, and the top 100 observations, after ordering all $X_i$'s according to their magnitude, are selected for confidence interval construction. The hyperparameter $\pi_0$ varies among 0.3, 0.5, 0.8 and 0.9. The Bayes FCR level that I aim at is 5%. When $M$ is small, (10) does not control the Bayes FCR at 5%; however, the mixed procedure (12) does control the Bayes FCR for any hyperparameters. The portions of the mixture increase as $\pi_0$ increases.

Fig 2. These figures show the simulated average lengths of different approaches under the same model settings as Fig 1. The average length of my procedure is less than or equal to that of B-Y's procedure. In some extreme cases, the average length of (12) is only 54% of that of B-Y's procedure.

Fig 3. These figures show the simulated Bayes FCR under different model settings against $M = \tau^2/(\sigma^2+\tau^2)$. The dimension is set to 1000. The selection rule is based on Benjamini and Hochberg (1995)'s procedure, which aims at controlling the False Discovery Rate at or below 5%. The hyperparameter $\pi_0$ varies among 0.3, 0.5, 0.8 and 0.9.
The Bayes FCR level that I aim at is 5%, which is represented by the magenta line. When $M$ is small, (10) does not control the Bayes FCR; however, the FCRs of the mixed procedure (12) and of B-Y's procedure are always less than or equal to the error bar, which equals $q$ plus the simulation error.

[Fig 4 panels: "leng: p=1000, pi0=0.3, q=0.05" and "leng: p=1000, pi0=0.5, q=0.05". Horizontal axis: M.]

Fig 4. These figures show the simulated average lengths of different approaches under the same model as Fig 3. The average length of my procedure is less than that of B-Y's procedure. In some extreme cases, the average length of (12) is only 44% of that of B-Y's procedure.

Fig 5. Three different interval approaches, Qiu and Hwang (2007), B-Y (2005), and (12), are applied to the Synteni data of Kerr, Martin, and Churchill (2000). B-H (1995)'s FDR procedure, which aims at finding the genes whose differential expression levels are significantly larger than or equal to $\log_2 3$ while controlling the False Discovery Rate at or below 5%, is applied to select genes for interval estimation. Among 1285 genes, 89 are declared significant, and the corresponding intervals are constructed and plotted in this figure. From the figure, one can see that the center of Qiu and Hwang (2007)'s procedure is the same as that of (12). However, since they aim at controlling the simultaneous coverage coefficient by using Bonferroni's correction, their intervals are much longer than those of (12). B-Y (2005) center their intervals at the biased estimators $X_i$'s. Thus they end up correcting the selection bias by increasing the length, and it turns out that their intervals are much longer than those of (12).
However, the lengths of B-Y's intervals are slightly smaller than those of Qiu and Hwang (2007)'s procedure.
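The selection and comparison steps mentioned in the captions of Figs 3 and 5 rely on two standard procedures: the Benjamini and Hochberg (1995) step-up rule for selection and the Benjamini and Yekutieli (2005) FCR-adjusted intervals. A minimal Python sketch of both is given below, assuming that p-values are already available and that $\sigma$ is known. The function names `bh_select` and `by_intervals` are hypothetical, and the sketch does not reproduce the p-value computation used for the Synteni data, which is not specified here.

```python
from statistics import NormalDist


def bh_select(p_values, q=0.05):
    """Benjamini-Hochberg step-up rule.

    Returns the sorted indices of the rejected hypotheses: the k smallest
    p-values, where k is the largest rank with p_(k) <= k * q / m.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    n_reject = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            n_reject = rank  # keep the largest rank that passes
    return sorted(order[:n_reject])


def by_intervals(x, selected, sigma=1.0, q=0.05):
    """FCR-adjusted intervals in the style of Benjamini and Yekutieli (2005).

    Each of the R selected parameters gets a marginal normal interval at
    level 1 - R * q / m, centered at the observed x[i]; sigma is assumed known.
    """
    m, r = len(x), len(selected)
    if r == 0:
        return {}
    z = NormalDist().inv_cdf(1.0 - r * q / (2.0 * m))
    return {i: (x[i] - z * sigma, x[i] + z * sigma) for i in selected}


if __name__ == "__main__":
    p_vals = [0.001, 0.8, 0.012, 0.04, 0.3]
    obs = [3.5, 0.1, -2.9, 2.1, 0.5]
    chosen = bh_select(p_vals, q=0.05)
    print(chosen, by_intervals(obs, chosen, sigma=1.0, q=0.05))
```

Because these intervals stay centered at the observed values, the selection bias is handled only by widening them, which is the behavior the caption of Fig 5 attributes to B-Y's procedure.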